Performance Comparison of Learning to Rank Algorithms for Information Retrieval

Authors

  • Ridho Reinanda
  • Dwi H. Widyantoro
Abstract

Learning to rank is the problem of ranking objects using machine learning techniques. One of its applications is ranking the documents in search results. In this research, we compare the performance of three learning to rank algorithms: RankSVM, LambdaMART, and Additive Groves. RankSVM, a ranking variant of the classical SVM algorithm, is commonly used as a baseline in learning to rank experiments. LambdaMART and Additive Groves are both tree ensemble algorithms. They belong to the class of algorithms that yielded top results in the recent Yahoo! Learning to Rank Challenge. The comparison is performed by evaluating the algorithms' results on a standard dataset. We also study the effect of feature selection on final algorithm performance. Two feature selection techniques, based on the filter and wrapper approaches, are implemented. The experiment is conducted with the LETOR 4.0 dataset, which contains 46 features such as TF, BM25, LMIR, and PageRank scores. Parameter tuning is performed beforehand to find the best parameters for a reliable comparison. The evaluation metric is Normalized Discounted Cumulative Gain (NDCG), a graded relevance measure for ranking results. The experimental results reveal that learning to rank algorithms outperform the conventional algorithms. In addition, applying feature selection improves the algorithms' performance.

Keywords: learning to rank algorithms; RankSVM; LambdaMART; Additive Groves; feature selection.
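As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of NDCG at a cutoff k. The abstract does not state which NDCG variant the authors used; this sketch assumes the common formulation with gain 2^rel - 1 and a log2(rank + 1) discount. The function names and the sample labels are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[int], k: int) -> float:
    """Discounted cumulative gain over the top-k positions.

    Assumes exponential gain (2^rel - 1) and a log2(rank + 1) discount.
    """
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)  # pos is 0-based, so rank = pos + 1
        for pos, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[int], k: int) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0.0:
        # Convention assumed here: a query with no relevant documents scores 0.
        return 0.0
    return dcg_at_k(relevances, k) / ideal


# Hypothetical graded labels (0 to 2) listed in the order a ranker returned the documents.
ranked_labels = [2, 0, 1, 0]
score = ndcg_at_k(ranked_labels, k=4)
print(f"NDCG@4 = {score:.4f}")
```

A perfectly ordered list scores 1.0, and moving a relevant document further down the list lowers the score, which is what makes the measure suitable for graded relevance labels.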

Similar resources

An Ensemble Learning-Based Algorithm for Learning to Rank in Information Retrieval

Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank has been shown to be useful in many applications of information retrieval, natural language processing, and data mining. Learning to rank can be described by two systems: a learning system and a ranking system. The learning system takes training data as input and constructs a ranking ...

Chaotic Genetic Algorithm based on Explicit Memory with a new Strategy for Updating and Retrieval of Memory in Dynamic Environments

Many of the problems considered in optimization and learning assume that solutions exist in a dynamic environment. Hence, algorithms are required that adapt dynamically to the problem's conditions and search under new conditions. In most cases, using information from the past allows quick adaptation after a change. This is the idea underlying the use of memory in this field, which involves key design issue...

An Effective Approach for Robust Metric Learning in the Presence of Label Noise

Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...

Effective Learning to Rank Persian Web Content

The Persian language is one of the most widely used languages on the Web. Hence, the Persian Web includes invaluable information that needs to be retrieved effectively. As with other languages, ranking algorithms for Persian Web content face different challenges, such as applicability issues in real-world situations as well as the lack of user modeling. CF-Rank, as a ...

Cost-Sensitive Support Vector Ranking for Information Retrieval

In recent years, researchers have proposed a number of learning to rank algorithms. However, in information retrieval, instances of ranks are imbalanced. After the instances are combined into pairs, the pairs of ranks are imbalanced too. In this paper, a cost-sensitive risk minimization model for pairwise learning to rank on imbalanced data sets is proposed. Following this model, the algorithm...

Publication date: 2014